Group 005E06
| Field Name | Data Type | Data Format | Length | Description | Example |
|---|---|---|---|---|---|
| Sex | Character | X | 1 | Sex of abalone | M, F, I |
| Length | Float | N.NNN | 4 | Length in mm \(\div\) 200 | 0.455 |
| Diameter | Float | N.NNN | 4 | Diameter in mm \(\div\) 200 | 0.365 |
| Height | Float | N.NNN | 4 | Height in mm \(\div\) 200 | 0.095 |
| Whole_weight | Float | N.NNNN | 5 | Whole weight in g \(\div\) 200 | 0.5140 |
| Shucked_weight | Float | N.NNNN | 5 | Weight of meat in g \(\div\) 200 | 0.2245 |
| Viscera_weight | Float | N.NNNN | 5 | Gut weight in g \(\div\) 200 | 0.1010 |
| Shell_weight | Float | N.NNNN | 5 | Dry shell weight in g \(\div\) 200 | 0.1500 |
| Rings | Integer | NN | 2 | Number of rings; add 1.5 to obtain age in years | 15 |
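
The scaling in the table above can be undone with a couple of one-line helpers. The following is a minimal sketch, assuming every continuous field was divided by 200 exactly as the descriptions state; the names `SCALE`, `to_physical`, and `age_from_rings` are hypothetical and do not come from the source.

```python
SCALE = 200  # divisor stated in the data dictionary (assumed to apply to every continuous field)


def to_physical(value_scaled: float) -> float:
    """Return a measurement in its original unit (mm or g) by undoing the division by 200."""
    return value_scaled * SCALE


def age_from_rings(rings: int) -> float:
    """Estimate age in years with the 'rings + 1.5' rule from the table."""
    return rings + 1.5


# Example values copied from the 'Example' column of the data dictionary.
example = {"Length": 0.455, "Diameter": 0.365, "Height": 0.095, "Whole_weight": 0.5140}
physical = {name: to_physical(value) for name, value in example.items()}

print(physical)            # lengths in mm, weight in g
print(age_from_rings(15))  # age estimate for the example ring count
```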

| Original Dataset | Live Abalone Dataset |
|---|---|
| Sex | Sex |
| Length | Length |
| Diameter | Diameter |
| Height | Height |
| Whole_weight | Whole_weight |
| Shucked_weight | |
| Viscera_weight | |
| Shell_weight | |
| Rings | Rings |

Each variable is scaled by the transformation shown in the full model below:
\(-\left(\frac{1}{Rings}\right)^{\frac{1}{4}} = \beta_0 + Sex[M] + Sex[F] + Sex[I] + \beta_1\log_{10}(Length) + \beta_2\log_{10}(Diameter) + \beta_3Height^{\left(\frac{1}{3}\right)}\)
\(+ \beta_4\log_{10}(Whole\ Weight) + \beta_5\log_{10}(Shucked\ Weight)\)
\(+ \beta_6\log_{10}(Viscera\ Weight) + \beta_7\log_{10}(Shell\ Weight) + \varepsilon_i\)
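
Because the response is modelled on a transformed scale, a fitted value has to be back-transformed before it can be read as a ring count. A short derivation, writing \(y\) for the transformed response (the symbol \(y\) is introduced here for illustration only and does not appear in the source):

\(y = -\left(\frac{1}{Rings}\right)^{\frac{1}{4}} \ \Rightarrow\ (-y)^{4} = \frac{1}{Rings} \ \Rightarrow\ Rings = \frac{1}{(-y)^{4}}\)

For example, \(y = -0.5\) corresponds to \(Rings = 1 / 0.5^{4} = 16\).
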
All variables are highly correlated, with every coefficient of determination ≥ 0.6.
Model selection uses forward and backward searches on the scaled live and original (full) abalone datasets.

| Predictors | Backward Live Abalone Model (Estimates) | Backward Live Abalone Model (p) | Forward Live Abalone Model (Estimates) | Forward Live Abalone Model (p) | Backward Original Model (Estimates) | Backward Original Model (p) | Forward Original Model (Estimates) | Forward Original Model (p) |
|---|---|---|---|---|---|---|---|---|
| (Intercept) | -0.22 | <0.001 | -0.22 | <0.001 | -0.21 | <0.001 | -0.21 | <0.001 |
| Sex [F] | 0.00 | 0.290 | 0.00 | 0.290 | -0.00 | 0.704 | -0.00 | 0.704 |
| Sex [I] | -0.00 | <0.001 | -0.00 | <0.001 | -0.00 | <0.001 | -0.00 | <0.001 |
| Length | -0.04 | <0.001 | -0.04 | <0.001 | -0.02 | 0.002 | -0.02 | 0.002 |
| Diameter | 0.06 | <0.001 | 0.06 | <0.001 | 0.02 | 0.004 | 0.02 | 0.004 |
| Height | 0.01 | <0.001 | 0.01 | <0.001 | 0.00 | <0.001 | 0.00 | <0.001 |
| Whole weight | 0.01 | <0.001 | 0.01 | <0.001 | 0.05 | <0.001 | 0.05 | <0.001 |
| Shucked weight | | | | | -0.05 | <0.001 | -0.05 | <0.001 |
| Viscera weight | | | | | -0.01 | <0.001 | -0.01 | <0.001 |
| Shell weight | | | | | 0.03 | <0.001 | 0.03 | <0.001 |
| Observations | 4177 | | 4177 | | 4177 | | 4177 | |
| R² / R² adjusted | 0.543 / 0.543 | | 0.543 / 0.543 | | 0.656 / 0.656 | | 0.656 / 0.656 | |
| AIC | -28130.663 | | -28130.663 | | -29313.170 | | -29313.170 | |

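
A forward and a backward search of the kind compared in this table can be sketched as follows. This is a minimal illustration on synthetic data, not the procedure used in the source: it assumes AIC is the selection criterion (the table reports AIC, but the source does not state the criterion or the software), and the names `aic_ols`, `forward_select`, and `backward_eliminate` are hypothetical.

```python
import numpy as np


def aic_ols(X, y):
    """AIC of a Gaussian linear model fitted by least squares (an intercept is added here)."""
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    k = design.shape[1] + 1  # regression coefficients plus the error variance
    return n * (np.log(2 * np.pi) + np.log(rss / n) + 1) + 2 * k


def forward_select(X, y):
    """Start from the intercept-only model; add the column that lowers AIC most until none helps."""
    selected, remaining = [], list(range(X.shape[1]))
    best = aic_ols(X[:, selected], y)
    while remaining:
        scores = {j: aic_ols(X[:, selected + [j]], y) for j in remaining}
        j_best = min(scores, key=scores.get)
        if scores[j_best] >= best:
            break
        best = scores[j_best]
        selected.append(j_best)
        remaining.remove(j_best)
    return sorted(selected), best


def backward_eliminate(X, y):
    """Start from the full model; drop the column whose removal lowers AIC most until none helps."""
    selected = list(range(X.shape[1]))
    best = aic_ols(X[:, selected], y)
    while selected:
        scores = {j: aic_ols(X[:, [c for c in selected if c != j]], y) for j in selected}
        j_drop = min(scores, key=scores.get)
        if scores[j_drop] >= best:
            break
        best = scores[j_drop]
        selected.remove(j_drop)
    return sorted(selected), best


# Synthetic data: only columns 0 and 2 truly influence the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 2] + rng.normal(scale=0.5, size=200)

forward_vars, forward_aic = forward_select(X, y)
backward_vars, backward_aic = backward_eliminate(X, y)
print("forward:", forward_vars, forward_aic)
print("backward:", backward_vars, backward_aic)
```

The two directions can in principle stop at different models. In the table above, each pair of forward and backward columns reports identical estimates and AIC values, so both searches arrived at the same model for each dataset.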
### Live Abalone Model
\(-(\frac{1}{Rings})^{\frac{1}{4}} = -0.2160858 + 0.0003333Sex[F] - 0.0029337Sex[I] - 0.0437280\log_{10}(Length)\)
\(+ 0.0554263\log_{10}(Diameter) + 0.0084491Height^{\left(\frac{1}{3}\right)} + 0.0104607\log_{10}(Whole\ Weight)\)
### Original Abalone Model
\(-(\frac{1}{Rings})^{\frac{1}{4}} = -0.2072139 - 0.0001040Sex[F] - 0.0018633Sex[I] -0.0222445\log_{10}(Length)\)
\(+ 0.0192558\log_{10}(Diameter) + 0.0031554Height^{\left(\frac{1}{3}\right)} + 0.0466486\log_{10}(Whole\ Weight)\)
\(-0.0484394\log_{10}(Shucked\ Weight)- 0.0058606\log_{10}(Viscera\ Weight) + 0.0309122\log_{10}(Shell\ Weight)\)
Live Abalone Model (Scaled)
Original Abalone Model (Scaled)

| Data / model | R² | RMSE | MAE |
|---|---|---|---|
| Live abalone data (scaled) | 0.541 | 0.008 | 0.007 |
| Raw live abalone data (no scaling) | 0.364 | 520.228 | 365.98 |
| Original data (scaled) | 0.654 | 0.007 | 0.006 |
| Raw original data (no scaling) | 0.53 | 442.57 | 316.687 |
| Live Model (scaled) | 0.54 | 0.008 | 0.007 |
| Original Model (scaled) | 0.652 | 0.007 | 0.006 |

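
The R² (Rsquared), RMSE, and MAE figures reported here can be computed from observed and predicted values along the following lines. This is a minimal sketch, assuming R² is the squared Pearson correlation between observed and predicted values (the source does not say which definition was used); `regression_metrics` is a hypothetical helper name, and the input numbers are made up for illustration.

```python
import math


def regression_metrics(observed, predicted):
    """Return R² (squared Pearson correlation), RMSE, and MAE for paired values."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n

    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r_squared = (cov * cov) / (var_o * var_p)

    return {"Rsquared": r_squared, "RMSE": rmse, "MAE": mae}


# Illustrative values only: predictions that are perfectly correlated but offset by 0.5.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5])
print(metrics)
```

RMSE and MAE are expressed in the units of the response, so values from a transformed response and from an untransformed one are not on the same scale and cannot be compared directly.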
Hurvich, C. M., & Tsai, C. (1989). Regression and time series model selection in small samples. Biometrika, 76(2), 297–307. https://doi.org/10.1093/biomet/76.2.297